«D 20S sua 

DOCUMENT RESUME

TM 810 421

AUTHOR          Tatsuoka, Kikumi K.; Tatsuoka, Maurice M.
TITLE           Detection of Aberrant Response Patterns and Their Effect on Dimensionality.
INSTITUTION     Illinois Univ., Urbana. Computer-Based Education Research Lab.
SPONS AGENCY    Office of Naval Research, Arlington, Va. Personnel and Training Research Programs Office.
PUB DATE        Apr 80
CONTRACT        N00014-79-C-0752
EDRS PRICE      MF01/PC03 Plus Postage.
DESCRIPTORS     *Error Patterns; *Goodness of Fit; Item Analysis; Mathematical Models; Test Construction; Test Reliability
IDENTIFIERS     Individualized Consistency Index; *Pattern Conformity Index; *Unidimensional Scaling



ABSTRACT

An index measuring the degree to which a binary response pattern conforms to some baseline pattern was defined and named the Pattern Conformity Index (PCI). One way of conceptualizing what the PCI measures is the extent to which each individual's particular response pattern contributes to, or detracts from, the overall consistency found in the group's mode of responding. One use of the PCI consists of spotting anomalous response patterns that result from a student's problems. From this it is a short step to utilizing the PCI for identifying a subgroup of students for whom the given set of items approximately constitutes a unidimensionally scalable set. The duality between students and items then permits selection of a subset of items for further improving the unidimensionality. A measure of how constant an individual's response pattern remains for parallel subsets of items occurring earlier and later in a test was developed and called the Individualized Consistency Index (ICI). (Author/BW)



***********************************************************************
* Reproductions supplied by EDRS are the best that can be made        *
* from the original document.                                         *
***********************************************************************







Computer-based Education Research Laboratory

University of Illinois

Urbana, Illinois



DETECTION OF 
ABERRANT RESPONSE PATTERNS 

AND 

THEIR EFFECT ON DIMENSIONALITY 




"PERMISSION TO REPRODUCE THIS 
MATERIAL HAS BEEN GRANTED BY 



TO THE EDUCATIONAL RESOURCES
INFORMATION CENTER (ERIC)."



KIKUMI K. TATSUOKA 
MAURICE M. TATSUOKA 



U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)
X This document has been reproduced as
  received from the person or organization
  originating it.
  Minor changes have been made to improve
  reproduction quality.

• Points of view or opinions stated in this document
  do not necessarily represent official NIE
  position or policy.



Approved for public release; distribution unlimited.
Reproduction in whole or in part permitted for any
purpose of the United States Government.



This research was sponsored by the Personnel and Training
Research Program, Psychological Sciences Division, Office
of Naval Research, under Contract No. N00014-79-C-0752.
Contract Authority Identification Number NR 150-415.



COMPUTERIZED ADAPTIVE TESTING AND MEASUREMENT



RESEARCH REPORT 80-4



APRIL 1980 



SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE          READ INSTRUCTIONS BEFORE COMPLETING FORM

1.   REPORT NUMBER: Research Report No. 80-4
2.   GOVT ACCESSION NO.:
3.   RECIPIENT'S CATALOG NUMBER:
4.   TITLE (and Subtitle): Detection of Aberrant Response Patterns and their Effect on Dimensionality
5.   TYPE OF REPORT & PERIOD COVERED: Nov. 21 '79 - Feb. 20 '80
6.   PERFORMING ORG. REPORT NUMBER: CERL Report E-15
7.   AUTHOR(s): Kikumi Tatsuoka, Maurice Tatsuoka
8.   CONTRACT OR GRANT NUMBER(s): N00014-79-C-0752
9.   PERFORMING ORGANIZATION NAME AND ADDRESS: Computer-based Education Research Laboratory, University of Illinois, Urbana, IL 61801
10.  PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS: 61153N; RR042-04; RR042-04-01; NR 150-415
11.  CONTROLLING OFFICE NAME AND ADDRESS: Personnel and Training Research Laboratory, Office of Naval Research (Code 458), Arlington, VA 22217
12.  REPORT DATE: April 1980
13.  NUMBER OF PAGES:
14.  MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):
15.  SECURITY CLASS. (of this report):
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE:
16.  DISTRIBUTION STATEMENT (of this Report): Approved for public release; distribution unlimited
17.  DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report):
18.  SUPPLEMENTARY NOTES:
19.  KEY WORDS (Continue on reverse side if necessary and identify by block number): error analysis, response patterns, integer operations, order analysis, consistency index, pattern conformity index, subset of unidimensionality
20.  ABSTRACT (Continue on reverse side if necessary and identify by block number):

An index measuring the degree to which a binary response pattern conforms to some baseline pattern was defined and named the Pattern Conformity Index (PCI). By "baseline pattern" we mean a binary response vector with all the 0's preceding the 1's when the items are arranged in descending order of difficulty or in some other, purposefully defined order.

DD FORM 1473 (1 JAN 73)     EDITION OF 1 NOV 65 IS OBSOLETE
S/N 0102-LF-014-6601

Unclassified
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)



It was shown that PCI is related to Cliff's consistency index C in the following manner: When the items are arranged in ascending order of the proportions of individuals in a group who pass the items, and the PCI is computed for each individual, a certain weighted average of these PCI's yields an index which is a slight modification of Cliff's C.

One way of conceptualizing what the PCI measures is the extent to which each individual's particular response pattern contributes to, or detracts from, the overall consistency found in the group's mode of responding.

The foregoing observations make it clear that one use of the PCI consists in spotting anomalous response patterns that result from a student's problems, for example. From this it is a short step to utilizing the PCI for identifying a subgroup of students for whom the given set of items approximately constitutes a unidimensionally scalable set. The duality between students and items then permits selection of a subset of items for further improving the unidimensionality.

Very roughly speaking, the culling out of students and/or items to achieve unidimensionality by using the PCI proceeds as follows. Students whose response patterns are so anomalous (i.e., so atypical of the group) as to have negative PCI values are eliminated from the outset. The weighted average of the PCI's of the remaining members of the group (referred to above as resembling Cliff's C) is computed. Then, in a manner somewhat analogous to removing variables in the backward elimination method of stepwise multiple regression, students are successively removed from the group in such a way that at each step the PCI-weighted-average for the remaining group shows the largest increment from the previous value. A suitable stopping rule terminates the process before the group gets intolerably small in size. A computer routine for effecting this procedure was developed.

The PCI measures the degree to which an individual's response pattern resembles the group's modal response pattern. Sometimes, however, we need a measure of how constant an individual's response pattern remains for parallel subsets of items occurring earlier and later in a test. One reason for this is that students often switch their rules of operation (either from one erroneous rule to another or from an erroneous to the correct rule) as they proceed through a test. Thus, their response patterns tend to be inconsistent among one another while learning is taking place, but become more and more consistent as they reach mastery level. An index for measuring individual consistency was developed and called the Individualized Consistency Index (ICI).

Unclassified



ACKNOWLEDGEMENTS

We wish to acknowledge the services rendered by the following persons.

Bob Baillie, for designing and writing several computer programs, technical data processing, collection routines as well as computerized instructional lessons.

Roy Lipschutz and Wayne Wilson for their artwork.




ABSTRACT

An index measuring the degree to which a binary response pattern conforms to some baseline pattern was defined and named the Pattern Conformity Index (PCI). By "baseline pattern" we mean a binary response vector with all the 0's preceding the 1's when the items are arranged in descending order of difficulty or in some other, purposefully defined order.

It was shown that PCI is related to Cliff's consistency index C in the following manner: When the items are arranged in ascending order of the proportions of individuals in a group who pass the items, and the PCI is computed for each individual, a certain weighted average of these PCI's yields an index which is a slight modification of Cliff's C.

One way of conceptualizing what the PCI measures is the extent to which each individual's particular response pattern contributes to, or detracts from, the overall consistency found in the group's mode of responding.

The foregoing observations make it clear that one use of the PCI consists of spotting anomalous response patterns that result from a student's problems, for example. From this it is a short step to utilizing the PCI for identifying a subgroup of students for whom the given set of items approximately constitutes a unidimensionally scalable set. The duality between students and items then permits selection of a subset of items for further improving the unidimensionality.

Very roughly speaking, the culling out of students and/or items to achieve unidimensionality by using the PCI proceeds as follows. Students whose response patterns are so anomalous (i.e., so atypical of the group) as to have negative PCI values are eliminated from the outset. The weighted average of the PCI's of the remaining members of the group (referred to above as resembling Cliff's C) is computed. Then, in a manner somewhat analogous to removing variables in the backward elimination method of stepwise multiple regression, students are successively removed from the group in such a way that at each step the PCI-weighted-average for the remaining group shows the largest increment from the previous value. A suitable stopping rule terminates the process before the group gets intolerably small in size. A computer routine for effecting this procedure was developed.

The PCI measures the degree to which an individual's response pattern resembles the group's modal response pattern. Sometimes, however, we need a measure of how constant an individual's response pattern remains for parallel subsets of items occurring earlier and later in a test. One reason for this is that students often switch their rules of operation (either from one erroneous rule to another or from an erroneous to the correct rule) as they proceed through a test. Thus, their response patterns tend to be inconsistent among one another while learning is taking place, but become more and more consistent as they reach mastery level. An index for measuring individual consistency was developed and called the Individualized Consistency Index (ICI).



DETECTION OF ABERRANT RESPONSE PATTERNS AND THEIR EFFECT ON DIMENSIONALITY

INTRODUCTION 

It would seem trite to say that the possibility of an examinee's getting correct answers for the wrong reasons always lurks behind dichotomously scored items and threatens to destroy the validity of the test, except for the fact that, until recently, this possibility has largely been ignored by psychometricians. Although scattered attempts have been made to give partial credit for partial knowledge, procedures for discrediting correct answers arrived at by incorrect means have largely been confined to the use of formulas for correction for guessing. This paucity may not be devastating for standardized ability tests, but is practically fatal in the context of achievement testing that is an integral part of the instructional process. There the test must serve the purpose of diagnosing what type of misconception exists, so that appropriate remedial instruction may be given. This calls for delving into the cognitive processes that are brought into play in solving problems, and trying to pinpoint just where the examinee went astray, even when the correct answer was fortuitously produced.

This type of testing was pioneered by Brown & Burton (1978), whose celebrated BUGGY is essentially an adaptive diagnostic testing system which utilizes network theory for routing examinees through a set of problems in the addition and subtraction of positive integers. [...] which differed from BUGGY in that it was not adaptive but [...]. The test was constructed for use in conjunction with lessons in the addition and subtraction of signed numbers (positive and negative integers) for eighth grade students and consists of four parallel subtests of 16 items each. A system of error vectors was developed for diagnosing the type(s) of error [...].

Crucial to this system of error diagnosis is the ability to tell whether and to what extent a response pattern is "typical" or "consistent". We may speak of consistency with respect to either the average response pattern of a group or an individual's own response pattern over time. To measure consistency in these two senses, two related but distinct indices are developed in this paper. They are called the "Norm Conformity Index" (NCI) and "Individual Consistency Index" (ICI), respectively.

It is shown that a certain weighted average of the NCI's of the members of a group yields one of Cliff's (1977) group consistency indices, $C_{t1}$. The higher the value of $C_{t1}$, the closer the group data set is to being unidimensional in the sense of forming a Guttman scale. Response patterns produced by erroneous rules are usually quite different from the average response pattern. Hence, removing individuals with low (usually negative) NCI values, i.e., those with aberrant response patterns, will yield a data set that is more nearly unidimensional.

The ICI, on the other hand, measures the degree to which an individual's response pattern remains invariant over time. Thus, for example, in the signed-number test consisting of four parallel subtests, the ICI indicates whether an examinee's response pattern changes markedly from one subtest to the next or remains relatively stable. Low ICI values, indicating instability of response pattern, would suggest that the examinee was still in the early stages of learning, changing his/her method for solving equivalent problems from one wave to the next. A high ICI value, reflecting stability of response pattern, would signal the nearing of mastery or a learning plateau.

While the NCI and ICI can each serve useful purposes as suggested above and illustrated in detail below, examining them jointly opens up various diagnostic possibilities, as does the consideration of each of them in combination with the total test score.



[...] of an entire group, they can be expressed as weighted averages [...] examinees (Krus, 1975; Mokken, 1970; Yamamoto & Wise, 1980). [...] group represent individual examinees' consistency of responses.

Birenbaum & Tatsuoka (1980) demonstrated that individual response patterns offer powerful information for determining any erroneous rule of operation that a given examinee may have used in taking a test. Tatsuoka et al. (1980) developed a diagnostic system for identifying erroneous rules by generating "error vectors", each of whose binary elements represents the presence/absence of a specific "atomic" error. In this paper we develop an index that associates with each response-pattern vector a number between -1 and 1 (inclusive) representing the degree of concordance the vector shows with a Guttman vector of the same length (i.e., the same number of 1's) with the items arranged in some purposefully specified order. For instance, they may be arranged, as they are in computing Cliff's indices, in descending order of difficulty for the total group; or they may be arranged in any particular order that suits a given purpose.



[...] Cliff's consistency indices. The value yielded by each of Cliff's [...] dominance matrix will also change. Consequently, the consistency index associated with response pattern $S$, defined as

$$C_p = \frac{2U_a}{U} - 1, \tag{2}$$

where $U_a = \sum_i \sum_{j>i} n_{ij}$ (the sum of the above-diagonal elements of $N$) and $U = \sum_i \sum_j n_{ij}$ (the sum of all the elements of $N$), is a function of the item order, $O$. To make this fact explicit, we write $C_p(O)$.



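Example 1: Let $S = (1\ 0\ 1\ 1\ 0)$. Writing $\bar{S}$ for the complement of $S$ (elements $1 - s_i$), so that $n_{ij} = (1 - s_i)s_j$, we have

$$N = \bar{S}'S = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} (1\ 0\ 1\ 1\ 0) = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \end{bmatrix}.$$

Here $U_a = \sum_i \sum_{j>i} n_{ij} = 2$ and $U = 6$; hence from Equation (2),

$$C_p(O) = (2)(2)/6 - 1 = -1/3 .$$

Example 2: Let $S = (0\ 0\ 1\ 1\ 1)$, the Guttman vector with three 1's. Then

$$N = \bar{S}'S = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} (0\ 0\ 1\ 1\ 1) = \begin{bmatrix} 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$

$U_a = 6$, $U = 6$; hence $C_p(O) = 12/6 - 1 = 1$.

Example 3: Let $S = (1\ 1\ 1\ 0\ 0)$, the "reversed Guttman vector". Then

$$N = \bar{S}'S = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 1 \end{bmatrix} (1\ 1\ 1\ 0\ 0) = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \end{bmatrix}.$$

Here $U_a = 0$, $U = 6$; hence $C_p(O) = 0/6 - 1 = -1$.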
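The three examples above can be reproduced with a few lines of code. The following is only an illustrative sketch, not the authors' computer routine: the function names `dominance_matrix` and `consistency_index` are hypothetical, and the dominance matrix is assumed to have elements $n_{ij} = (1 - s_i)s_j$, as used in the examples. Exact fractions are used so that the results can be compared directly with the hand calculations.

```python
from fractions import Fraction


def dominance_matrix(s):
    """Return N with n_ij = (1 - s_i) * s_j for a 0/1 response pattern s."""
    return [[(1 - si) * sj for sj in s] for si in s]


def consistency_index(s):
    """C_p(O) = 2*U_a/U - 1 (Eq. 2), items assumed already arranged in order O."""
    n = dominance_matrix(s)
    size = len(s)
    # U_a: sum of the above-diagonal elements of N
    u_a = sum(n[i][j] for i in range(size) for j in range(i + 1, size))
    # U: sum of all the elements of N
    u = sum(sum(row) for row in n)
    if u == 0:
        raise ValueError("C_p(O) is undefined when U = 0")
    return Fraction(2 * u_a, u) - 1


# The patterns of Examples 1, 2 and 3
for pattern in [(1, 0, 1, 1, 0), (0, 0, 1, 1, 1), (1, 1, 1, 0, 0)]:
    print(pattern, consistency_index(pattern))
```

The sketch raises an error for all-0 or all-1 patterns, for which $U = 0$ and Equation (2) gives no value.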

From the foregoing examples, the first two of the following properties of $C_p(O)$ may be inferred. The other properties are illustrated by further examples, and intuitive arguments are given to substantiate them. Their formal proofs are not difficult but tedious, and are therefore omitted.

Property 1: $-1 \le C_p(O) \le 1$.

Property 2: If the order of items is reversed in $S$, the absolute value of $C_p(O)$ remains unchanged, but its sign is reversed.



Since $U = \sum_i \sum_j n_{ij} = \sum_i \sum_j (1 - s_i)s_j$, it is invariant with respect to permutations of the elements of $S$. On the other hand, if the order of the elements of $S$ is reversed, so that $s'_i = s_{n-i+1}$, the $U_a$ for the new dominance matrix will become

$$U'_a = \sum_i \sum_{j>i} n'_{ij} = \sum_i \sum_{j>i} (1 - s_{n-i+1})\, s_{n-j+1},$$

which can be shown to be equal to $U - U_a$. Therefore

$$C'_p(O) = \frac{2(U - U_a)}{U} - 1 = -\frac{2U_a}{U} + 1 = -C_p(O).$$
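Properties 1 and 2 can also be checked by brute force. The sketch below is an illustration under stated assumptions rather than a proof: it enumerates every 0/1 pattern of length 6 (an arbitrary choice), skips the all-0 and all-1 patterns for which $U = 0$, and uses a hypothetical helper `c_p` that evaluates Equation (2) with $n_{ij} = (1 - s_i)s_j$.

```python
from fractions import Fraction
from itertools import product


def c_p(s):
    """Consistency index of Eq. (2) for a 0/1 pattern s with n_ij = (1 - s_i) * s_j."""
    n = len(s)
    u_a = sum((1 - s[i]) * s[j] for i in range(n) for j in range(i + 1, n))
    u = sum((1 - s[i]) * s[j] for i in range(n) for j in range(n))
    return Fraction(2 * u_a, u) - 1


checked = 0
for s in product((0, 1), repeat=6):
    if 0 < sum(s) < len(s):                 # U = 0 for all-0 and all-1 patterns
        assert -1 <= c_p(s) <= 1            # Property 1
        assert c_p(s[::-1]) == -c_p(s)      # Property 2: reversal flips the sign
        checked += 1
print(checked, "patterns checked")
```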



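Property 3: The consistency index $C_p(O)$ associated with a $2 \times n$ data matrix, comprising two response-pattern vectors $S_1$ and $S_2$, is a weighted average of the $C_p(O)$'s associated with $S_1$ and $S_2$, respectively.

If $S = \begin{bmatrix} S_1 \\ S_2 \end{bmatrix}$, the consistency index for $S$ is based on the dominance matrix

$$N = \bar{S}'S = (\bar{S}_1',\ \bar{S}_2') \begin{bmatrix} S_1 \\ S_2 \end{bmatrix} = \bar{S}_1'S_1 + \bar{S}_2'S_2 .$$

Therefore, if we let

$$U_k = \sum_i \sum_j n^{(k)}_{ij} \quad \text{for } k = 1, 2$$

and

$$U_{ka} = \sum_i \sum_{j>i} n^{(k)}_{ij},$$

it follows that the $U$ and $U_a$ for $S$ are given by

$$U = U_1 + U_2$$

and

$$U_a = U_{1a} + U_{2a} .$$

Hence

$$C_p(O) = \frac{2(U_{1a} + U_{2a})}{U_1 + U_2} - 1 = \frac{U_1}{U_1 + U_2}\left(\frac{2U_{1a}}{U_1} - 1\right) + \frac{U_2}{U_1 + U_2}\left(\frac{2U_{2a}}{U_2} - 1\right) = w_1 C_p(O)_1 + w_2 C_p(O)_2 .$$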



Remark: The two response patterns $S_1$ and $S_2$ may be either those of two individuals or of a single individual taking a set of items on two occasions (as in a repeated-measures design) or two parallel sets of items. In the first case the $C_p(O)$ associated with the $2 \times n$ data matrix would be an average $C_p(O)$ for the pair of individuals; in the second, it would be an average over two measurement occasions for one person.

By mathematical induction on Property 3, it follows that the $C_p(O)$ associated with an $N \times n$ data matrix

$$X = \begin{bmatrix} S_1 \\ S_2 \\ \vdots \\ S_N \end{bmatrix}$$

is a weighted average of the $C_p(O)$'s associated with the individual response-pattern vectors $S_1, S_2, \ldots, S_N$. In particular, when the items are arranged in descending order of difficulty for the group comprising the $N$ individuals, the $C_p(O)$ associated with $X$ is one of Cliff's (1977) consistency indices, $C_{t1}$. For this particular ordering of items, we give the name "norm conformity index" to the $C_p(O)$'s associated with the individual response patterns.

Definition: Norm Conformity Index, NCI

When the item ordering is in descending order of difficulty for a particular group (designated the "norm group"), the consistency index $C_p(O)$ associated with the individual's response pattern $S$ is called the norm conformity index, denoted by NCI. Thus, NCI indicates the extent to which a response vector $S$ approximates the Guttman vector (in which all the zeros are to the left of the 1's) with the same number of 1's, when the items are arranged in descending order of difficulty for the norm group.

With this definition, plus an expanded version of Property 3, we state the relationship between Cliff's consistency index $C_{t1}$ and the NCI's for the individuals in the group as

Property 4: Cliff's consistency index $C_{t1}$ is a weighted average of the $\mathrm{NCI}_k$ ($k = 1, 2, \ldots, N$), with weights $w_k = U_k/U$; i.e.,

$$C_{t1} = \sum_{k=1}^{N} \frac{U_k}{U}\, \mathrm{NCI}_k ,$$

where

$$U_k = \sum_i \sum_j n^{(k)}_{ij}$$

and

$$U = \sum_{k=1}^{N} U_k .$$



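Example 4: Let $S_1 = (0\ 1\ 0\ 1\ 1)$, $S_2 = (0\ 0\ 1\ 1\ 1)$ and $S_3 = (0\ 0\ 0\ 0\ 1)$ be the response-pattern vectors for three individuals. Then, by calculations similar to those shown in Examples 1, 2, and 3, we get (upon writing NCI for $C_p(O)$)

$$U_{1a} = 5,\quad U_1 = 6,\quad \mathrm{NCI}_1 = 2/3$$
$$U_{2a} = 6,\quad U_2 = 6,\quad \mathrm{NCI}_2 = 1$$
$$U_{3a} = 4,\quad U_3 = 4,\quad \mathrm{NCI}_3 = 1$$

Hence,

$$w_1\mathrm{NCI}_1 + w_2\mathrm{NCI}_2 + w_3\mathrm{NCI}_3 = (6/16)(2/3) + (6/16)(1) + (4/16)(1) = 7/8 .$$

On the other hand, with

$$X = \begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \qquad N = \bar{X}'X = \begin{bmatrix} 0 & 1 & 1 & 2 & 3 \\ 0 & 0 & 1 & 1 & 2 \\ 0 & 1 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix},$$

we have $U_a = 15$, $U = 16$, and the consistency index for $X$ is $30/16 - 1 = 7/8$, thus verifying Property 4.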
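The check in Example 4 can be repeated mechanically. The sketch below is illustrative only: the helper `u_parts` is a hypothetical name, and it takes $U = x(n - x)$ for a pattern with $x$ 1's among $n$ items, which is the same quantity as the sum of all elements of the dominance matrix. It computes the weighted average of the individual NCI's and, separately, the index obtained from the pooled $U_a$ and $U$.

```python
from fractions import Fraction


def u_parts(s):
    """Return (U_a, U) for one 0/1 response pattern s."""
    n = len(s)
    u_a = sum((1 - s[i]) * s[j] for i in range(n) for j in range(i + 1, n))
    x = sum(s)                     # test score: number of 1's
    return u_a, x * (n - x)        # U equals the sum of all n_ij


# The three response patterns of Example 4
patterns = [(0, 1, 0, 1, 1), (0, 0, 1, 1, 1), (0, 0, 0, 0, 1)]
parts = [u_parts(s) for s in patterns]
total_u_a = sum(u_a for u_a, _ in parts)
total_u = sum(u for _, u in parts)

nci = [Fraction(2 * u_a, u) - 1 for u_a, u in parts]
weights = [Fraction(u, total_u) for _, u in parts]

weighted_average = sum(w * c for w, c in zip(weights, nci))
pooled_index = Fraction(2 * total_u_a, total_u) - 1
print(weighted_average, pooled_index)
```

Both quantities come out as 7/8, in agreement with the hand calculation.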




In the paragraph preceding Property 4, the order of the items was taken to be the order of difficulty for the group of which the individual was a member, for $C_p(O)$ to be called NCI. Actually, as evident in the formal definition of NCI, the group need not be one to which the individual belongs. It can be any group which the researcher chooses for defining the baseline or "criterion order" of the items; hence our referring to it as the norm group, and the index as the norm conformity index. Thus, for example, we might be concerned with two groups of students with vastly different instructional backgrounds but similar abilities. It is then quite possible for the difficulties of various skills to be rather different in the two groups. We might take Group 1 as the norm group, thus arranging the items in descending order of difficulty for this group. We could compute NCI's for members of both Group 1 and Group 2 on the basis of this criterion order, and would probably find the mean NCI for the two groups to be significantly different. The following examples, based on real data, illustrate this.

Example 5: The seventh grade students of a junior high school were divided at random into two groups, which were given different lessons teaching signed-number operations (Tatsuoka & Birenbaum, 1979). One sequence of lessons taught the operations by the Postman Stories approach (Davis, 1964) while the other used the number-line method.

After addition problems had been taught, a 52-item test including both addition and subtraction problems was administered to all students. A t-test showed no significant difference between the mean test scores of the two groups, as indicated in Table 1. However, when NCI's were computed for all students, using the item-difficulty order in Group 1 (the Postman-Stories group) as the baseline, there was a significant difference between the mean NCI of the two groups.



Table 1

The Means of Test Scores and the Norm Conformity Index

                 Group 1 (N = 67)    Group 2 (N = 62)
Total Score
  mean                20.06               18.36          t = 1.190, p > .05
  SD                   8.30                7.88
NCI
  mean                  .55            [illegible]       t = 2.246, p = .0264
  SD                    .23                 .23
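As a rough plausibility check on the total-score comparison in Table 1, the t statistic can be recomputed from the reported means, standard deviations and group sizes. The report does not say which form of the t-test was used; the sketch below assumes the ordinary pooled-variance two-sample test, and the function name `pooled_t` is hypothetical.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled variance estimate (assumed test form)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


# Total-score rows of Table 1: Group 1 (N = 67) versus Group 2 (N = 62)
t_total = pooled_t(20.06, 8.30, 67, 18.36, 7.88, 62)
print(round(t_total, 2))
```

Under that assumption the recomputed value is about 1.19, close to the tabled t = 1.190.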



Example 6: Tatsuoka & Birenbaum (1979) demonstrated that proactive inhibition affected the performance on tests in material learned through subsequent instructions. The response patterns of students who studied new lessons written by using a different conceptual framework from that of their previous instructions showed a significantly different performance pattern. By a cluster analysis, four groups among which response patterns are significantly different were identified. The NCI values for 91 students, based on the order of tasks determined by the proportion correct in the total sample, were calculated and an analysis of variance was carried out. The F-value was significant at p < 0.05.



Table 2

ANOVA of Norm Conformity Index for Four Groups
With Different Instructional Backgrounds

Group     N     Mean of NCIs     F
  1      34        0.18          3.62 with df = 3, 87
  2      27        0.41
  3      20        0.35
  4      10        0.18





Up to this point, the $U_a$ and $U$ in Eq. (2) defining $C_p(O)$, and hence NCI as a special case, were defined in terms of the numbers of dominances and counter-dominances between item pairs in the dominance matrix $N$. We now show that $U_a$ can be explicitly defined in terms of the proximity of a response vector $S$ to a Guttman vector with the same number of 1's.

Property 5: Let $S$ be a response-pattern vector of an examinee on an $n$-item test, $N = \bar{S}'S$ the associated dominance matrix, and

$$U_a = \sum_{i=1}^{n} \sum_{j>i} n_{ij};$$

then $U_a$ is also the number of transpositions required to get from $S$ to the reversed Guttman vector (all 1's preceding the zeros).

Since $n_{ij} = (1 - s_i)s_j$, it follows that

$$U_a = \sum_i \sum_{j>i} (1 - s_i)s_j$$

is the number of ordered pairs $(s_i, s_j)$ $[i < j]$ of elements of $S$ such that $s_i = 0$ and $s_j = 1$. That is, if for each $s_i = 0$ we count the number of $s_j = 1$ to its right in $S$, then the sum of these numbers over the set of 0's in $S$ is equal to $U_a$. But this is the same as the number of transpositions (interchanges of elements in adjacent (0,1) pairs) needed to transform $S$, step by step, into $(1\ 1\ \ldots\ 1\ 0\ 0\ \ldots\ 0)$. Thus $U_a$ is a measure of remoteness of $S$ from the reversed Guttman vector, which is equivalently its proximity to the Guttman vector.



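Example 7: Let $S = (0\ 1\ 0\ 1\ 1)$. Then $S$ can be transformed into $(1\ 1\ 1\ 0\ 0)$ by five successive transpositions:

$$(0\ 1\ 0\ 1\ 1) \to (1\ 0\ 0\ 1\ 1) \to (1\ 0\ 1\ 0\ 1) \to (1\ 1\ 0\ 0\ 1) \to (1\ 1\ 0\ 1\ 0) \to (1\ 1\ 1\ 0\ 0);$$

thus $U_a = 5$ by the present definition. On the other hand,

$$N = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} [0\ 1\ 0\ 1\ 1] = \begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

and $U_a = \sum_i \sum_{j>i} n_{ij} = 5$ by the earlier definition.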
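The equivalence stated in Property 5 can be illustrated by actually carrying out the transpositions. The sketch below is one possible way of doing so and is not taken from the report: the function name `transpositions_to_reversed_guttman` is hypothetical, and the order in which adjacent (0,1) pairs are swapped is an arbitrary choice (each swap removes exactly one (0,1) ordered pair, so the total count does not depend on that order).

```python
def u_a(s):
    """U_a: number of ordered pairs (s_i, s_j), i < j, with s_i = 0 and s_j = 1."""
    n = len(s)
    return sum((1 - s[i]) * s[j] for i in range(n) for j in range(i + 1, n))


def transpositions_to_reversed_guttman(s):
    """Swap adjacent (0,1) pairs until none remain; return (swap count, final pattern)."""
    s = list(s)
    count = 0
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(s) - 1):
            if s[i] == 0 and s[i + 1] == 1:
                s[i], s[i + 1] = s[i + 1], s[i]   # one transposition
                count += 1
                swapped = True
    return count, s


# The pattern of Example 7
count, final = transpositions_to_reversed_guttman([0, 1, 0, 1, 1])
print(count, final, u_a([0, 1, 0, 1, 1]))
```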

It may also be noted that, if we denote the number of 1's in the lower triangle of $\bar{S}'S$ by $U_b$, i.e.,

$$U_b = \sum_i \sum_{j<i} n_{ij},$$

then $U_b$ is the number of ordered pairs $(s_i, s_j)$ $[j < i]$ of elements of $S$ such that $s_i = 0$ and $s_j = 1$. Hence,

$$U = U_a + U_b = \sum_i \sum_j n_{ij}$$

is the number of pairs $(s_i, s_j)$ with $s_i \ne s_j$ that can be formed from the elements of $S$. Thus, $U = x(n - x)$, where $x$ is the number of 1's in $S$, or the test score earned by a person with response pattern $S$. Consequently, $U_a/U$ and $U_b/U$ are the proportions of (0,1) pairs and (1,0) pairs, respectively, among all possible ordered pairs $(s_i, s_j)$ $[i < j]$ of unlike elements. When $S$ is a Guttman vector, $(0\ 0\ \ldots\ 0\ 1\ 1\ \ldots\ 1)$, $U_a = U$ and $U_b = 0$, because all ordered pairs of unlike elements are (0,1) pairs. Conversely, when $S$ is a reversed Guttman vector $(1\ 1\ \ldots\ 1\ 0\ 0\ \ldots\ 0)$, $U_a = 0$ and $U_b = U$. Hence $U_a/U$ ranges from 0 to 1 as an increasing function of the degree to which $S$ resembles (or is proximal to) a Guttman vector. Similarly, $U_b/U$ measures the proximity of $S$ to a reversed Guttman vector, or its remoteness from a Guttman vector. In fact $U_b/U$ was denoted by $U'$ and proposed as an index of "deviance" of score patterns by van der Flier (1977).

With the above redefinition of $U_a$ and $U$, the sense in which NCI is a measure of the extent to which a response pattern approximates a Guttman vector should have become clearer.

$$\mathrm{NCI} = \frac{2U_a}{U} - 1$$

is a rescaling of $U_a/U$ to have limits 1 and -1 instead of 1 and 0.

It should be noted that $U_a/U$, and hence also NCI, is undefined for a person who has a test score of either 0 or $n$, since $U = x(n - x) = 0$ in both these cases. There are two ways (at least) in which to cope with this problem. The first is arbitrarily to set NCI = 1 when $U = 0$, which is analogous to setting $0! = 1$. This is reasonable because $U = 0$ only for $S = (0\ 0\ \ldots\ 0)$ and $S = (1\ 1\ \ldots\ 1)$, both of which are Guttman vectors in the sense of having no zero to the right of any 1. The second solution is to redefine NCI itself as

$$\mathrm{NCI} = \frac{2(U_a + 1)}{U + 1} - 1, \tag{3}$$

which will automatically make NCI = 1 for the all-correct and all-incorrect response patterns. Each of these solutions, however, gives rise to problems of its own, as shown in the discussion section below.
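The two ways of handling $U = 0$ can be put side by side in code. This is a hedged sketch with hypothetical function names (`nci_with_convention`, `nci_eq3`); it only evaluates the two formulas on a few patterns and makes no claim about which the authors prefer.

```python
from fractions import Fraction


def u_parts(s):
    """Return (U_a, U) for a 0/1 response pattern s, with U = x(n - x)."""
    n = len(s)
    u_a = sum((1 - s[i]) * s[j] for i in range(n) for j in range(i + 1, n))
    x = sum(s)
    return u_a, x * (n - x)


def nci_with_convention(s):
    """First solution: Eq. (2), with NCI set to 1 by convention when U = 0."""
    u_a, u = u_parts(s)
    return Fraction(1) if u == 0 else Fraction(2 * u_a, u) - 1


def nci_eq3(s):
    """Second solution: Eq. (3), NCI = 2(U_a + 1)/(U + 1) - 1."""
    u_a, u = u_parts(s)
    return Fraction(2 * (u_a + 1), u + 1) - 1


for s in [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1), (0, 1, 0, 1, 1), (1, 1, 1, 0, 0)]:
    print(s, nci_with_convention(s), nci_eq3(s))
```

For the all-0 and all-1 patterns both versions give 1. For other patterns the two formulas differ: for the reversed Guttman vector (1 1 1 0 0), for instance, Equation (2) gives -1 while Equation (3) gives -5/7.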

Property 6: Suppose $S_1$ and $S_2$ are two $n$-item response patterns with the same number $x$ of 1's, and that $S_2$ results from $S_1$ by applying $t$ successive transpositions. Then

$$C_p(S_2) = C_p(S_1) \pm \frac{2t}{x(n - x)},$$

where the + sign is taken when $S_2$ is closer to a Guttman vector than is $S_1$, and the - sign when the opposite is true.

From Property 5 the $U_a$ associated with a given response pattern $S$ is the number of transpositions necessary for getting from $S$ to the reversed Guttman vector with the same number of 1's. Hence, if $t$ is the number of transpositions it takes to get from $S_1$ to $S_2$, it follows that

$$U_{2a} = U_{1a} + t$$

if $S_2$ is farther from the reversed Guttman vector, i.e., closer to the Guttman vector, than is $S_1$, and

$$U_{2a} = U_{1a} - t$$

if the opposite is true. Consequently,

$$C_p(S_2) = \frac{2U_{2a}}{U} - 1 = \frac{2(U_{1a} + t)}{U} - 1 = C_p(S_1) + \frac{2t}{x(n - x)}$$

when $S_2$ is closer than $S_1$ to the Guttman vector. The sign preceding $2t/x(n - x)$ becomes - when $S_2$ is farther than $S_1$ from the Guttman vector.

Example 8: Let $S_1 = (1\ 0\ 1\ 0\ 1\ 1)$ and $S_2 = (0\ 1\ 0\ 1\ 1\ 1)$. It takes two transpositions to get from $S_1$ to $S_2$:

$$(1\ 0\ 1\ 0\ 1\ 1) \to (0\ 1\ 1\ 0\ 1\ 1) \to (0\ 1\ 0\ 1\ 1\ 1),$$

and $S_2$ is closer than $S_1$ to the Guttman vector $(0\ 0\ 1\ 1\ 1\ 1)$. Therefore, by Property 6 we should have

$$C_p(S_2) = C_p(S_1) + (2)(2)/(4)(2) = C_p(S_1) + 1/2 .$$

For the two response patterns, we have

$$U_{1a} = 5 \quad \text{and} \quad U_{2a} = 7,$$

so that

$$C_p(S_1) = (2)(5)/8 - 1 = 1/4$$

and

$$C_p(S_2) = (2)(7)/8 - 1 = 3/4,$$

satisfying the above relation.
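Example 8 can be verified numerically as well. The following sketch assumes the same definition of the consistency index as in Equation (2), with $U = x(n - x)$; the helper name `c_p` is hypothetical.

```python
from fractions import Fraction


def c_p(s):
    """Consistency index 2*U_a/U - 1 for a 0/1 pattern s, with U = x(n - x)."""
    n = len(s)
    u_a = sum((1 - s[i]) * s[j] for i in range(n) for j in range(i + 1, n))
    x = sum(s)
    return Fraction(2 * u_a, x * (n - x)) - 1


s1 = (1, 0, 1, 0, 1, 1)
s2 = (0, 1, 0, 1, 1, 1)
t = 2                        # transpositions taking s1 to s2, as listed in Example 8
n, x = len(s1), sum(s1)

difference = c_p(s2) - c_p(s1)
predicted = Fraction(2 * t, x * (n - x))   # Property 6, + sign: s2 is closer to Guttman
print(c_p(s1), c_p(s2), difference, predicted)
```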

Property 7. The weights applied to individual NCI's in computing Cliff's consistency index $C_{t1}$ (cf. Property 4) are invariant under changes in the baseline order of items.

This is true because the weights

$$w_p = \frac{U_p}{\sum_q U_q}$$

depend only on $U_p = x_p(n - x_p)$, where $x_p$ is the total score earned by person p, i.e., on the number of 1s in response pattern $S_p$, and not on their positions.

It follows that NCI's associated with response patterns yielding scores close to n/2 get high weights while those corresponding to extreme scores get low weights. It is also seen that when the number of persons is large, each $w_p$ is a fairly small positive number, while the NCI has a value between 1 and -1. Negative NCI's are an obstruction to having a large group consistency index.
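The dependence of the weights on the total score alone can be seen numerically. The sketch below is illustrative only: the function name `cliff_weights` and the six total scores are hypothetical and are not taken from the report.

```python
def cliff_weights(scores, n):
    """Return w_p = U_p / sum(U_q), where U_p = x_p * (n - x_p)."""
    u = [x * (n - x) for x in scores]
    total = sum(u)
    if total == 0:
        raise ValueError("every score is 0 or n, so all U_p vanish")
    return [u_p / total for u_p in u]


# Hypothetical total scores of six persons on a 7-item test.
scores = [1, 2, 3, 4, 5, 6]
weights = cliff_weights(scores, n=7)
for x, w in zip(scores, weights):
    print(f"score {x}: weight {w:.3f}")
```

Scores of 3 and 4 (closest to n/2) receive the largest weights, and scores of 1 and 6 the smallest, whatever the positions of the 1s in the underlying patterns.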






INDIVIDUAL CONSISTENCY INDEX 






In the preceding section we defined, and described various properties of, an index which measures the extent to which an individual's response pattern "conforms" to that of a norm group. In some situations it is desirable to measure the extent to which an individual's response pattern remains unchanged or "consistent" over the passage of time. For example, it is reasonable to expect that, when a student is in the process of learning and hence presumably modifying the cognitive processes by which he/she attempts to solve problems, his/her pattern of responses on successive sets of similar items will change considerably from one set to the next. When the student approaches mastery or a "learning plateau", his/her response pattern will probably remain relatively consistent from one set to the next. To define an index, called the Individual Consistency Index (ICI), that will serve to measure the degree of consistency (or stability) of an individual's response pattern over time, and to investigate its properties, are the purposes of this section. In the interest of clarity and ease of exposition, we embed our discussions in the context of an actual experimental study.

A 64-item signed-number test was administered to 153 seventh graders at a junior high school. The test comprised 16 different tasks, each tested by four parallel items. The items were arranged so that four parallel subtests were successively given to each testee. Within each 16-item subtest, the order of items was randomized. Thus, for each examinee there are four response-pattern vectors with 16 elements each. The Individual Consistency Index (ICI) is defined on these four replications. We shall come back to this test later, but we first introduce ICI by a simpler example. Suppose a person took four parallel tests A, B, C, D with seven items each, and that his/her response patterns were as shown in the second column of Table 3. Also shown in this table are U = x(7 - x) for each response pattern, the number of transpositions $U_a$ needed to transform each response pattern into a reversed Guttman vector, the $C_p$ for each response pattern, and the weight to be applied to each $C_p$ for getting an overall index.



Table 3

Four response patterns and various quantities associated with them.

Parallel     Response       U_j    U_aj    C_p(S_j)    w_j
test (j)     pattern
   1         (1010010)      12      4       -.333      .286
   2         (0010010)      10      6        .200      .238
   3         (1000010)      10      4       -.200      .238
   4         (1000010)      10      4       -.200      .238



The weighted average

$$\sum_{j=1}^{4} w_j C_p(S_j) = -.143$$

would be Cliff's consistency index $C_{t1}$ if the four response patterns of Table 3 were those of four individuals and if the items had been arranged in their difficulty order for the group. Let us rearrange the items (or rather the sets of parallel items) in their order of difficulty for the person, which is (2,4,5,7,3,1,6). The response patterns and other quantities occurring in Table 3 now become as shown in Table 4, which also has a new column showing the number of transpositions $t_j$ necessary to get from the j-th response pattern in Table 3 to the new one here.

Table 4

Response patterns resulting from those in Table 3 by arranging the items in difficulty order, and various associated quantities.

Parallel     Response       t_j    U_j    U_aj    C_p(S'_j)    w_j
test (j)     pattern
   1         (0000111)       8     12     12       1.0         .286
   2         (0000101)       3     10      9        .8         .238
   3         (0000011)       6     10     10       1.0         .238
   4         (0000011)       6     10     10       1.0         .238





Note that the new $C_p$ for each response pattern satisfies Property 6. The weighted average of the new $C_p$ values is

$$\sum_{j=1}^{4} w_j C_p(S'_j) = .9524.$$

This is what we call the Individual Consistency Index, ICI. We may state its definition as follows.

Definition: Individual Consistency Index (ICI). Given a set of response patterns shown by a single individual on a set of parallel tests, we arrange the parallel items in their overall order of difficulty for the individual and compute the $C_p$ for each response pattern thus modified. If we now form a weighted average of these $C_p$'s as though we were computing Cliff's $C_{t1}$, in accordance with Property 4, the result is the ICI.
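The definition can be traced on the four response patterns of the seven-item example. The sketch below is a minimal illustration, not the authors' program: the function names are hypothetical, items are ordered from hardest to easiest (so that a Guttman vector has its 0s first), and items with equal totals are left in their original relative order, a tie-breaking choice the report does not address.

```python
def u_a(pattern):
    """Number of (0, 1) pairs in which the 0 precedes the 1."""
    zeros_seen = pairs = 0
    for response in pattern:
        if response == 0:
            zeros_seen += 1
        else:
            pairs += zeros_seen
    return pairs


def weighted_consistency(patterns):
    """Weighted average of C_p = 2*U_a/U - 1 with weights U_j / sum(U)."""
    n = len(patterns[0])
    u = [sum(p) * (n - sum(p)) for p in patterns]
    total = sum(u)
    if total == 0:
        raise ValueError("all patterns are all-0 or all-1; weights are 0/0")
    # w_j * C_p(S_j) simplifies to (2*U_aj - U_j) / sum(U).
    return sum(2 * u_a(p) - u_j for p, u_j in zip(patterns, u)) / total


def ici(patterns):
    """Reorder items by the individual's own difficulty, then average."""
    n = len(patterns[0])
    totals = [sum(p[i] for p in patterns) for i in range(n)]
    order = sorted(range(n), key=lambda i: totals[i])  # hardest first, stable
    reordered = [[p[i] for i in order] for p in patterns]
    return weighted_consistency(reordered)


patterns = [
    [1, 0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 1, 0],
]
print(round(weighted_consistency(patterns), 3))  # -0.143 (original item order)
print(round(ici(patterns), 4))                   # 0.9524
```

Using the simplified numerator means a pattern with U = 0 contributes nothing instead of causing a division by zero, which corresponds to giving it zero weight.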

Remark: Note that ICI is an attribute of a single individual, not of a group as is Cliff's consistency index. ICI differs also from NCI in that the latter (also an individual attribute) depends on the baseline order of items, i.e., the difficulty order in some group specified as the norm group, whereas ICI is computed for an individual with no reference to any group. Rather, ICI requires that the individual in question has taken two or more parallel tests, and measures the consistency of his/her response patterns across these parallel tests.

Property 8: Since the parallel items are arranged in their order of difficulty for the individual in question when ICI is computed, while they are arranged in their order of difficulty for a norm group when NCI is computed, it follows that

$$\mathrm{ICI} \ge \mathrm{NCI}$$

for each examinee.



APPLICATION TO ERROR ANALYSIS I

Birenbaum & Tatsuoka (1980) found that 1-0 scoring based simply on right or wrong answers caused serious problems when erroneous rules of signed-number operations were used by many examinees. The point is that many erroneous rules can lead to correct answers in many items.

To highlight the extent of the problem, Tatsuoka et al. (1980) developed an almost exhaustive set of 72 erroneous rules for doing addition and subtraction of pairs of signed numbers, and enumerated the number of correct answers that would result from consistently using each incorrect rule for a set of 16 items. The resulting histogram is shown in Figure 1, where it can be seen that, in an extreme case, 12 out of the 16 items could be answered correctly by an erroneous rule of operation.

Using real data from a 64-item test consisting of four parallel subtests of 16 items each, Birenbaum & Tatsuoka first did a principal components analysis on the original data, with the items scored 1 or 0 in the usual manner. Next, the data were modified by giving a score of 0 when an item was correctly answered presumably by use of an erroneous rule, and another principal components analysis was done. (Details of how it was judged that a correct answer was arrived at by an incorrect rule are given in Birenbaum & Tatsuoka, 1980.) The change between the two analyses was dramatic. The dimensionality of the data became much more clearcut with the modified data. The item-total correlations became much higher, while the means of the 16 tasks (each represented by four parallel items) did not change significantly.



Figure 1. Histogram of total scores generated by erroneous rules of operation. (Axis label: SCORE.)

The above phenomenon suggests why some achievement tests cannot be treated as unidimensional even though the items are taken from a single content domain. The fact that correct answers can be obtained by erroneous rules apparently makes for a chaotic, multidimensional domain (Tatsuoka & Birenbaum, 1979), which is "cleaned up" by the rescoring. Brown & Burton (1978) warned of the same problem, namely that wrong rules can yield the correct answers in some items involving addition and subtraction of positive integers.

The indices developed in this paper, NCI and ICI, are useful for detecting erroneous rules that are consistently used by an examinee or a group of examinees. This capability is useful not only in the teaching process, for diagnosing a student's problem, but also gives some leads to addressing some psychometric issues such as the dimensionality of achievement tests.



Table 5 shows a 2 x 2 contingency table based on combinations of high and low NCI and ICI values, with a characterization of the status of students in each cell, dependent also on the score earned.

Table 5

Types of students with high and low NCI, ICI and score.

NCI     ICI     Characterization
High    Low     There should be few students in this cell (none if the cutting points for ICI and NCI are the same, since ICI >= NCI always).
High    High    If score is high, all is well. If score is low, student has a serious misconception (consistently uses an incorrect rule) which, however, leads to correct answers to easy items and wrong answers to hard items.
Low     Low     The errors are probably random.
Low     High    If score is high, student is merely getting a few of the easy items wrong. If score is low, student is getting many of the easy items wrong. The response pattern is strange, and a serious problem exists.



Example 9. The example described at the beginning of the section on the Individual Consistency Index was thoroughly analyzed in Technical Report 80-1 (Birenbaum & Tatsuoka, 1980) with respect to error analyses. We call this data the November data hereafter. There are 16 different erroneous rules diagnosed in the report. Table 6 shows the response patterns and NCI and ICI values for three students. Student 1 performed all addition problems correctly but he failed to change the sign of the subtrahend when he converted subtraction problems to addition problems. Student 2 always added the two numbers and took the sign of the larger number for her answers. She failed to discriminate subtraction problems from addition problems and applied this erroneous rule consistently to all 16 tasks. Student 3 achieved fairly well but he occasionally mistyped or made careless mistakes. The rules of operation, both right and wrong, are discussed in detail in Technical Report 80-2 (Tatsuoka et al., 1980).



Table 6

The response patterns, NCI, and ICI of three students.

Responses to four parallel forms within the 64-item test.

Task           Item           Student 1       Student 2       Student 3
6              [illegible]    1111 (+10)      1111 (10)       1111
[illegible]    -6 + 4         1111 (-2)       0000 (-10)      1111
3              12 + -3        1111 (9)        0000 (+15)      1111
5              -3 + 12        1011 (9)        0000 (+15)      1111
10             -14 + -5       1111 (-19)      1111 (-19)      1111
11             3 + -5         1111 (-2)       0000 (-8)       1111
[illegible]    -5 + -7        1111 (-12)      1111 (-12)      1111
7              8 - 6          0000 (+14)      0000 (+14)      1111
[illegible]    -16 - (-7)     0000 (-23)      0000 (-23)      1111
16             2 - 11         0000 (+13)      0000 (+13)      0111
13             -3 - +12       0000 (+9)       0000 (+15)      1111
1              -6 - (-8)      0000 (-14)      0000 (-14)      1111
12             [illegible]    0000 (+2)       1111 (+16)      0011
4              1 - (-10)      0000 (-9)       0000 (-11)      1010
2              -7 - 9         0000 (+2)       0000 (+16)      1111
9              -12 - 3        0000 (-9)       1111 (-15)      0111

NCI                           0.9759          -0.2560         0.7073
ICI                           1.0000          1.0000          .9268
Score                         27              20              58

Note: Tasks are ordered by their overall difficulties over four parallel forms.
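The two consistently applied rules described for Students 1 and 2 can be written out and checked against the items that are legible in Table 6. This is only a sketch under stated assumptions: the function names are hypothetical, "larger" in Student 2's rule is taken to mean larger in absolute value (which reproduces the answers printed in the table), the handling of equal magnitudes is arbitrary, and the two rows whose item text is illegible are omitted.

```python
def correct_answer(a, op, b):
    """The right answer to a signed-number item a op b."""
    return a + b if op == "+" else a - b


def student1_rule(a, op, b):
    """Turns a - b into a + b without changing the sign of b."""
    return a + b


def student2_rule(a, op, b):
    """Adds the magnitudes and attaches the sign of the larger-magnitude operand."""
    magnitude = abs(a) + abs(b)
    dominant = a if abs(a) >= abs(b) else b  # tie-breaking is an assumption
    return magnitude if dominant >= 0 else -magnitude


# Items as read from Table 6, in the table's order (operands as printed).
items = [
    (-6, "+", 4), (12, "+", -3), (-3, "+", 12), (-14, "+", -5),
    (3, "+", -5), (-5, "+", -7), (8, "-", 6), (-16, "-", -7),
    (2, "-", 11), (-3, "-", 12), (-6, "-", -8), (1, "-", -10),
    (-7, "-", 9), (-12, "-", 3),
]


def pattern(rule):
    """1 where the rule happens to give the right answer, else 0."""
    return [int(rule(a, op, b) == correct_answer(a, op, b)) for a, op, b in items]


print(pattern(student1_rule))  # 1s on the six additions, 0s on the eight subtractions
print(pattern(student2_rule))  # 1s only where both operands are negative in effect
```

Both rules yield some correct answers for the wrong reason, which is the phenomenon that 1-0 scoring cannot detect.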



Table 7(a) shows the classification by cutoff points for ICI and for NCI in the subsample of 75 students who earned scores of 53 or higher, while Table 7(b) is the corresponding table for 47 students with scores of 52 or lower. The dividing point, 52, for scores was chosen because, as shown in Figure 1, 12 out of each subtest of 16 items could conceivably be answered correctly by consistent use of an erroneous rule; 13 is the smallest number of items that cannot be got correct in this way, which corresponds to 52 out of the entire test of four parallel subtests of 16 items each. Hence it seems reasonable to regard 52 or less as "low scores".



Table 7

Two-way classifications based on ICI >= or < .90 and NCI >= or < .60 among students with (a) scores >= 53; (b) scores <= 52.

(a) Scores >= 53
                 ICI < .90    ICI >= .90    Total
NCI >= .60           8            18          26
NCI <  .60          26            23          49
Total               34            41          75

(b) Scores <= 52
                 ICI < .90    ICI >= .90    Total
NCI >= .60           3            27          30
NCI <  .60          12             5          17
Total               15            32          47
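The characterizations of Table 5, combined with the cutoff points used in Table 7, amount to a small decision rule. The sketch below is illustrative only: the function name `classify`, the wording of the returned labels, and the treatment of values exactly at a cutoff as "high" are assumptions, not part of the report.

```python
def classify(nci, ici, score, nci_cut=0.60, ici_cut=0.90, score_cut=52):
    """Map an (NCI, ICI, score) triple to the Table 5 characterization."""
    high_nci = nci >= nci_cut
    high_ici = ici >= ici_cut
    high_score = score > score_cut  # 52 or less counts as a low score

    if high_nci and not high_ici:
        return "rare cell: low ICI with high NCI"
    if high_nci and high_ici:
        if high_score:
            return "all is well"
        return "consistent incorrect rule (serious misconception)"
    if not high_ici:
        return "errors probably random"
    if high_score:
        return "a few easy items wrong"
    return "strange pattern, serious problem"


# NCI, ICI and score of the three students in Table 6.
print(classify(0.9759, 1.0000, 27))
print(classify(-0.2560, 1.0000, 20))
print(classify(0.7073, 0.9268, 58))
```

The score is consulted only after the NCI and ICI cells are fixed, mirroring the way Table 5 splits the two high-ICI cells by score.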



Let us see what we can say about the performances of the students represented in the November data, from the contingency tables of Table 7, in light of the characterizations given in Table 5 and with the three students' response patterns in Table 6 to guide us to some extent. Note first that, despite the very high cutoff point of .90 for the "high ICI" category, substantially more than a majority (73 out of 122) of the students have high ICI values. This reflects the fact that the examinees were eighth graders who had already received fairly extensive instruction in signed-number operations and hence a relatively large proportion of them showed consistent response patterns over the four parallel subtests; they had already approached mastery or a learning plateau, the latter being more likely in this case in view of the fact that only 75 (or 61.5%) of them had scores over 52 out of 64. As expected, very few students (11 out of 122) had low ICI combined with high NCI. Many more had low-ICI, low-NCI combinations; these are students who made more or less "random" (or at least non-systematic) errors but who nevertheless made relatively more errors among items that were easy for the group as a whole. It is reasonable that about 70% of the low scorers who had low NCIs fell in this category, while only 53% of the high scorers with low NCIs did.

Returning to the high-ICI group, of which the three students represented in Table 6 are different kinds of examples, the high-ICI, high-NCI students with high scores are the "problem-free" types exemplified by Student 3. Unfortunately there are only 18 such students while there are 27 high-ICI, high-NCI students with low scores. Student 1 is an example of this type of student, and his response patterns corroborate the characterization in Table 5, that he has a serious misconception, but one which leads to correct answers (except for one probably careless error) to the easy items (addition) and always to wrong answers to the [...]

[...] unfortunate fact that a few easy items missed causes the NCI to [...] in the high-ICI, low-NCI low scorers, of whom there are fortunately only five. Student 2 in Table 6 exemplifies this type; her unusual response pattern (which remains perfectly consistent over the four parallel subtests) will take quite a bit of remedial instruction to rectify.






APPLICATION TO ERROR ANALYSIS II

Examining the ICI-NCI-score combination [...] the extent to which a dataset has been "contaminated" in the sense of having its unidimensionality destroyed by the consistent occurrence of erroneous rules of operation. This can be done by drawing a scatterplot of NCI against total score for the given dataset and comparing it with the corresponding scatterplot based on synthetic data generated by the 72 erroneous rules referred to at the beginning of the previous section. It is convenient also to draw the regression line of NCI on score for the synthetic data.

The procedure is illustrated in Figure 2, where the scatterplot of NCI against "task-mastery score"* for the November data (points represented by x's) is superimposed on the corresponding scatterplot for the artificially produced data generated by erroneous rules (points represented by o's) with the regression line shown. It can be seen at a glance that almost all of the real data points fall above the regression line for the synthetic data. In fact, a large majority of them even fall above the dashed line parallel to the regression line, which is drawn one S.E. above the latter.



*The "task-mastery score" is defined as follows: If a student gets [...] for a given task, his/her mastery score for that task is 1; otherwise it is 0.

tH« ifvtniN ^fi^trt fitltil il^lir Nfn^ iHMaMi %||ifii««| Hun^l^^r apiii tM«*tii 
¥tA 1^ PirAlO i«i«|i|if^| tf iiMi oiif HciMir ##1,^ | fl»(iav#) . 

t'ittMrt Hi fiigtjtii; Ihn ihtl^ Minn Mf «irfuo<iaM« rMl«ii #|KiMn4fi^io 




Figure 2. Scatterplot of NCI vs. total score for the November mastery score data, superimposed on that for 72 synthetic response patterns generated by erroneous rules.

x = real data (N = 127)
o = synthetic data (N = 72)



EXTRACTING UNIDIMENSIONAL SUBSETS

Item Characteristic Curve (ICC) theories are useful and powerful test-theory models, especially for applications in adaptive testing. If the test items are drawn from a single, unidimensional domain, logistic models are convenient for estimating the item-curve parameters. Tatsuoka (1980) examined the response patterns of students for whom ability estimates with known item parameters failed to converge by the maximum-likelihood method, and found them all to have low NCI values. Conversely, when the NCI is very small, the maximum-likelihood method often fails to yield a convergent estimate for the ability variable θ. Table 8 shows three such response patterns (along with one for which the θ estimate did converge) for the 48-item Stanford Vocabulary Test taken by the seventh graders in the January experiment. The item parameters for the three-parameter logistic model fitted to these data were estimated by LOGIST (Wood, Wingersky & Lord, 1976).

Table 8

One Convergent and Three Nonconvergent Response Patterns for Estimating θ by the MXL Method and Their NCI Values

No.     Response pattern                                    Estimated θ    NCI            No. of
        (48 items ordered roughly from easier to harder)                                  iterations
1(b)    000010110000000010001100000000010000100000001001    -3.789         .0158          25(a)
3(b)    111100001000000000000000000100010000100110000000    -1.335         .2526          7
4       000000000000000000000000000000000000001111111111    -4.957         -1.0           25
5       000000000000000000000000000011111111110000000000    -25.00         [illegible]    25

(a) Iterations were terminated at 25 tentatively, but the decrements of three cases exceeded .001.
(b) These two response patterns are taken from the real data and the other two are hypothetical response patterns.



The problem of non-convergence of maximum-likelihood estimation procedures for ICC models due to failure of the data to exhibit unidimensionality has been plaguing researchers for a long time. Mokken (1970) goes as far as to state that, although ICC theory has many valuable features, studies in sociology and political science are not quite ready to take advantage of the refined parameter estimation methods that it offers. To cope with this situation, he developed a technique for extracting scalable subsets of items from a given dataset. Similar techniques have been developed in the fields of educational and psychological testing (Krus, 1977; Reynolds, 1976; Yamamoto & Wise, 1980). Theoretically, all these methods (which are based on order analysis) show a duality between items and examinees, and hence can in principle be used for extracting subsets of examinees as well as items in which unidimensionality will hold. In practice, however, most if not all of them would be quite inefficient for extracting examinee subsets because the dominance matrix for this purpose would be of order equal to the number of examinees instead of test items, and would thus be very large. We therefore present a new technique, based on the NCI, for extracting unidimensional examinee subsets that does not require the use of a dominance matrix as the starting point and hence is probably more efficient than those that do (although we have not yet made a formal comparison). Our technique is, of course, just as applicable for extracting unidimensional item subsets, but in that case its advantage, if any, over previous methods is probably negligible.



Recalling that Cliff's group consistency index $C_{t1}$ is a weighted average of the NCI's of the members of the group (with the group itself defining the item-difficulty order), it is clear that individuals with negative NCI values are detrimental to the goal of getting a large $C_{t1}$ value. We therefore remove any individual with a negative NCI from the group at the outset, before starting our extraction procedures. Let the NCI for the j-th member of the group thus purged be

$$\mathrm{NCI}_j = \frac{2U_{aj}}{U_j} - 1.$$

Then, by Property 4 the overall consistency of the group, whose size we denote by N, is

$$C_p(X_N) = \sum_{j=1}^{N} w_j \mathrm{NCI}_j,$$

where

$$w_j = U_j \Big/ \sum_{j=1}^{N} U_j.$$

Now suppose we remove the k-th member of the group and denote the reduced data matrix by $X^{(k)}_{N-1}$. Then

*It is realized that removal of an individual from a group may conceivably alter the difficulty order of the items, and hence the NCI's; but this is improbable, especially when the deleted individual has a small NCI, as we shall presently see he/she does.



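$$C_p(X^{(k)}_{N-1}) = \sum_{j \ne k} w'_j \mathrm{NCI}_j,$$

where

$$w'_j = \frac{U_j}{\sum_{j=1}^{N} U_j - U_k}.$$

The resulting change in overall consistency is

$$\Delta C_p^{(k)} = C_p(X^{(k)}_{N-1}) - C_p(X_N) = \sum_{j \ne k} w'_j \mathrm{NCI}_j - \sum_{j=1}^{N} w_j \mathrm{NCI}_j = \sum_{j=1}^{N} (w'_j - w_j)\mathrm{NCI}_j - w'_k \mathrm{NCI}_k.$$

But

$$w'_j - w_j = \frac{U_j}{U_. - U_k} - \frac{U_j}{U_.} = \frac{U_k U_j}{(U_. - U_k)U_.} = \frac{U_k}{U_. - U_k}\, w_j,$$

where $U_.$ is an abbreviation for $\sum_{j=1}^{N} U_j$. Therefore

(4)   $$\Delta C_p^{(k)} = \frac{U_k}{U_. - U_k} \sum_{j=1}^{N} w_j \mathrm{NCI}_j - w'_k \mathrm{NCI}_k = \frac{U_k}{U_. - U_k}\left[ C_p(X_N) - \mathrm{NCI}_k \right],$$

since

$$w'_k = \frac{U_k}{U_. - U_k}.$$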

We thus see that, in order to make $\Delta C_p^{(k)}$ as large as possible (that is, to have the removal of the k-th member result in as large an increase in $C_p$ as possible), two conditions must be satisfied; namely, $\mathrm{NCI}_k$ should be as small as possible, while $U_k$ should be as large as possible. The first of these conditions is intuitively obvious. Since the overall $C_p$ is a weighted average of the individual NCI's, elimination of the smallest NCI would be expected to increase the group $C_p$ the most. However, since the latter is a weighted rather than a straight average of the NCI's, it is also necessary that the NCI to be eliminated have as high a weight as possible, namely, that the associated $U_k$ be as large as possible. Recalling that

$$U_k = x_k(n - x_k),$$

it follows that $x_k$ should be as close as possible to n/2 in order for $U_k$ to be large.



From a purely mathematical standpoint, the above optimizing condition for $\Delta C_p^{(k)}$ would require our actually computing $\Delta C_p^{(k)}$ for each potential individual to be removed, for there could be a tradeoff between $\mathrm{NCI}_k$ being small and $U_k$ being large (i.e., $|x_k - n/2|$ being small). In practice, however, it is highly unlikely that a person with a small NCI would have a middling score that could yield a large $U_k$. That is, practically everyone with a small NCI would have a relatively extreme score, leading to a fairly small $U_k$. Hence, the smallness of NCI becomes the overriding consideration in selecting the individual to be removed. It therefore suffices to compute $\Delta C_p^{(k)}$ in accordance with Equation (4) for just those members of the group who have the few smallest values of NCI and select the one among them that yields the largest $\Delta C_p^{(k)}$.

In the foregoing manner, we would successively remove the member of the remaining group that produces the largest increase in the overall $C_p$ (noting that $U_.$ and $C_p(X_N)$ have to be recomputed at each step), until the value of $C_p$ achieves a satisfactory target magnitude.
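The removal procedure just described can be sketched as follows. This is a minimal illustration rather than the authors' program: the names `group_consistency`, `delta_c` and `extract` are hypothetical, each member is represented only by a pair (NCI, U) with U > 0, and the NCI's are treated as unchanged when a member is removed (that is, any shift in the item-difficulty order is ignored). The `candidates` parameter, the number of smallest NCI's examined at each step, is likewise an assumption.

```python
def group_consistency(members):
    """C_p as the U-weighted average of the members' NCI values."""
    total_u = sum(u for _, u in members)
    return sum(u * nci for nci, u in members) / total_u


def delta_c(members, k):
    """Equation (4): change in C_p caused by removing member k."""
    total_u = sum(u for _, u in members)
    nci_k, u_k = members[k]
    return u_k / (total_u - u_k) * (group_consistency(members) - nci_k)


def extract(members, target, candidates=3):
    """Remove members one at a time until C_p reaches the target."""
    group = [m for m in members if m[0] >= 0]  # purge negative NCI's first
    while len(group) > 1 and group_consistency(group) < target:
        # Only the few members with the smallest NCI's are examined.
        lowest = sorted(range(len(group)), key=lambda k: group[k][0])[:candidates]
        best = max(lowest, key=lambda k: delta_c(group, k))
        if delta_c(group, best) <= 0:
            break  # no candidate removal would raise C_p
        group.pop(best)
    return group


# Hypothetical (NCI, U) pairs for five examinees.
members = [(0.9, 12), (0.1, 10), (0.5, 6), (-0.3, 12), (0.8, 10)]
kept = extract(members, target=0.8)
print(kept, round(group_consistency(kept), 4))
```

Recomputing the group consistency and the total of the U's inside the loop corresponds to the remark that both quantities have to be updated at each step.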



DISCUSSION 



The subset of examinees (or items) extractable by the technique described in the preceding section (or, indeed, by all earlier methods, to our knowledge) is unidimensional (or nearly so) in the scalability or order-theoretic sense. On the other hand, the unidimensionality of data required for satisfactory practical functioning of, and parameter estimation in, ICC models is more closely related to that in the factor-analytic sense.* Since it is well known (e.g., Guttman, 1950; DuBois, 1970) that scalability of a set of items leads to a simplex and, depending on the distribution of difficulties of the items, may produce a correlation matrix with up to n/2 factors, it may seem meaningless to strive for unidimensionality in the sense of scalability when the purpose is to improve the applicability of ICC models.

Despite the foregoing circumstance, experience has shown that when a set of items approximates scalability in a given group of examinees, the factorial structure also becomes "clearer" and the estimability of the latent trait parameter is improved. This is described in detail by Birenbaum and Tatsuoka (1980), who found that all these improvements (scalability, factorial determinacy and estimability of θ) result simultaneously when item scores are modified by assigning zeros to those items that were deemed (by an elaborate system of error analysis) to be correctly answered for wrong reasons. Thus, improving unidimensionality in the scalability sense (or, to put it another way, removing examinees with aberrant response patterns) does enhance the practical applicability of ICC models up to a certain point. But there are limits to the efficacy of this approach, which are discussed elsewhere (Tatsuoka & Tatsuoka, 1980).

*Lord and Novick (1968, pp. 374-375) show that the matrix of item tetrachorics having unit rank (when communalities are inserted in the diagonal) is a sufficient but "very far from being [a] necessary" condition for the assumption of local independence of the items to hold.

We now turn our attention to a couple of difficulties with the NCI that we have yet to resolve to our satisfaction. The first is the excessively small (i.e., close to -1) value received by a student whose test score would have been perfect except for his/her getting one or two very easy items wrong by mistyping or some other clerical error. For instance, consider the response pattern (111111111101), whose NCI value is -.818. Yet, this student's getting the second easiest item wrong is almost certainly due to a random clerical error, and hence the response pattern should not be regarded as "extremely atypical" in the sense of its implying a serious misconception. In particular, it seems incongruous that such a response pattern should be automatically deleted from the outset in the method for extraction of unidimensional subsets discussed in the preceding section. Thus, this extreme sensitivity of the NCI to one or two "happenstantial" zeros in a response pattern that would otherwise receive a value of +1.0 is an undesirable property so long as we adopt overall group consistency as the criterion for extracting unidimensional subsets.
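The sensitivity described above is easy to reproduce. The following is a minimal sketch with a hypothetical function name `nci`; it assumes the items are ordered from hardest to easiest, so that the last position is the easiest item and a Guttman vector has its 0s first.

```python
def nci(pattern):
    """NCI = 2 * U_a / U - 1, with U = x * (n - x)."""
    n = len(pattern)
    x = sum(pattern)
    u = x * (n - x)
    if u == 0:
        raise ValueError("NCI is undefined for all-0 or all-1 patterns")
    zeros_seen = u_a = 0
    for response in pattern:
        if response == 0:
            zeros_seen += 1
        else:
            u_a += zeros_seen  # each earlier 0 forms a (0, 1) pair with this 1
    return 2 * u_a / u - 1


slip_on_easy_item = [1] * 10 + [0, 1]   # (111111111101)
miss_on_hardest_item = [0] + [1] * 11   # (011111111111)

print(round(nci(slip_on_easy_item), 3))  # -0.818
print(nci(miss_on_hardest_item))         # 1.0
```

Both patterns have the same total score of 11 out of 12; only the position of the single 0 differs, and it moves the index from one end of its range almost to the other.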




Fortunately, however, the above defect of the NCI does not affect its usefulness in the diagnostic procedure utilizing the fourfold table, based on the NCI, ICI and score combination, displayed in Table 5. It will be recalled that, so long as the total score is high, a student with low NCI will not be diagnosed as having serious problems even when the ICI is high. It is also seen from Table 5 that, before any diagnosis can be made on the basis of the NCI's being high or low, it is essential to examine whether the total test score is high or low.

All in all, it appears that the ICI is the more useful of the two indices for diagnostic purposes. Its drawback is that it requires the test to be constructed out of two or more parallel subtests. Alternatively, we might say that, for achievement tests to perform as powerful diagnostic tools, they should incorporate several parallel subparts.

It was pointed out earlier that Equation (2) fails when the test score is either zero or perfect, making U = 0. Intuitively, $C_p$ should have the value 1 in both these cases, since a response vector with all elements equal to 0 and one with all elements equal to 1 are both Guttman vectors. This can be brought about in one of two ways: (a) by arbitrarily defining $C_p = 1$ when U = 0 despite the fact that Equation (2) does not apply in this case, much as we define 0! = 1 even though the definition $n! = 1 \cdot 2 \cdot 3 \cdots n$ does not make sense for n = 0; or (b) by changing the very definition of $C_p$ to read

(5)   $$C_p = \frac{2(U_a + 1)}{U + 1} - 1$$

instead of $C_p = 2U_a/U - 1$ as in Equation (2).

Alternative (a) has the advantage of preserving the definition of Cliff's consistency index $C_{t1}$ as a weighted average

$$\sum_{j=1}^{N} w_j \mathrm{NCI}_j \quad \text{with} \quad w_j = \frac{U_j}{\sum_{j=1}^{N} U_j},$$

as stated earlier. For any pattern with $U_j = 0$, $w_j$ would be 0 and hence the value $\mathrm{NCI}_j = 1$ would not enter into the weighted average for computing $C_{t1}$. This is consistent with Cliff's definition, because perfect and all-zero response patterns do not contribute anything to a dominance matrix, since no item dominates any other item in such response patterns.

On the other hand, alternative (a) has the disadvantage of rendering undefined the ICI for a student with perfect (or all-zero) response patterns on all of the parallel subtests, for the combining weights for all the (individually perfect) $C_p$'s would then take the indeterminate form 0/0. Thus, it would require another definition by fiat to give such a student's ICI the value of 1, which is what it should be, since all the response patterns are identical.

By contrast, alternative (b) would avoid this difficulty. With the revised definition (5) for $C_p$, the combining weight associated with the $C_p$ of the j-th among a set of m response patterns would be

(6)   $$w_j = \frac{U_j + 1}{\sum_{j=1}^{m} U_j + m}.$$

Consequently, the combining weights used in calculating the ICI for a student who consistently shows perfect (or all-zero) response patterns on m parallel subtests would uniformly equal $w_j = 1/m$. Hence his/her ICI will now be 1.0, as it should be. On the other hand, alternative (b) would lead to an overall group consistency index that does not agree with Cliff's $C_{t1}$, since each NCI would no longer be a linear transform of $U_a/U$.
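The behavior of alternative (b) on degenerate patterns can be checked directly. The sketch below is illustrative only: the function names are hypothetical, the response patterns are assumed to be already arranged in the individual's own difficulty order (hardest item first), and the weights are those of Equation (6).

```python
def u_a(pattern):
    """Number of (0, 1) pairs in which the 0 precedes the 1."""
    zeros_seen = pairs = 0
    for response in pattern:
        if response == 0:
            zeros_seen += 1
        else:
            pairs += zeros_seen
    return pairs


def c_p_revised(pattern):
    """Definition (5): C_p = 2 * (U_a + 1) / (U + 1) - 1."""
    x = sum(pattern)
    u = x * (len(pattern) - x)
    return 2 * (u_a(pattern) + 1) / (u + 1) - 1


def ici_revised(patterns):
    """Weighted average of revised C_p's with weights (U_j + 1) / (sum(U) + m)."""
    n = len(patterns[0])
    m = len(patterns)
    u = [sum(p) * (n - sum(p)) for p in patterns]
    denominator = sum(u) + m
    return sum((u_j + 1) / denominator * c_p_revised(p)
               for p, u_j in zip(patterns, u))


all_correct = [[1] * 7 for _ in range(4)]
all_wrong = [[0] * 7 for _ in range(4)]
print(ici_revised(all_correct), ici_revised(all_wrong))  # 1.0 1.0

# The four reordered seven-item patterns of the earlier example.
reordered = [
    [0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 1],
]
print(round(ici_revised(reordered), 4))
```

For the reordered seven-item patterns the revised index comes out slightly higher than the .9524 obtained with the original definition, which illustrates that definition (5) changes the values of the indices even for non-degenerate patterns.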

Thus, each of the alternatives for making NCI take the value 1.0 for perfect and all-zero response patterns has its advantage and its disadvantage, and we have a dilemma. In view of the more important role played by the ICI compared to the NCI in error diagnosis, we are inclined to favor alternative (b). However, further investigation of other possible implications carried by definition (5) for pattern conformity needs to be made before we make a definite commitment.



REFERENCES 

Birenbaum, M., & Tatsuoka, K.K. The use of information from wrong response
patterns (Technical Report 80-1). Urbana, Ill.: University of
Illinois, Computer-based Education Research Lab, 1980.

Brown, J.S., & Burton, R.R. Diagnostic models for procedural bugs in
basic mathematics skills. Cognitive Science, 1978, 2, 155-192.

Cliff, N. A theory of consistency of ordering generalizable to tailored
testing. Psychometrika, 1977, 375-399.

Davis, R.B. The Madison Project discovery in mathematics: A text
for teachers. Reading: Addison-Wesley, 1964.

DuBois, P.H. Varieties of psychological test homogeneity. American
Psychologist, 1970, 25, 532-536.

Guttman, L.L. Studies in social psychology in World War II. In S.A.
Stouffer (Ed.), Measurement and Prediction, Vol. IV. Princeton:
Princeton University Press, 1950.

Krus, D.J. Order analysis of binary data matrices. Los Angeles: Theta
Press, 1977.

Lord, F.M., & Novick, M.R. Statistical theories of mental test scores.
Reading: Addison-Wesley, 1968.

Mokken, R.J. A theory and procedure of scale analysis: With applications
in political research. The Hague: Mouton, 1971.

Reynolds, T.J. The analysis of dominance matrices: Extraction of
unidimensional contest (Technical Report No. 3). Los Angeles: University of
Southern California, Department of Psychology, June 1976.

Tatsuoka, K.K., & Birenbaum, M. The danger of relying solely on diagnostic
testing when prior and subsequent instructional methods are different
(Technical Report 79-2). Urbana, Ill.: University of Illinois,
Computer-based Education Research Laboratory, 1979.



Tatsuoka, K.K. The least squares estimation of latent trait variables.
Paper presented at the 1979 American Educational Research Association,
Boston.

Tatsuoka, K.K., & Tatsuoka, M.M. Consistent errors, dimensionality and
local independence (Technical Report 80-6), to be prepared.

Tatsuoka, K.K., Birenbaum, M., Tatsuoka, M.M., & Baillie, R. A psychometric
approach to error analysis on response patterns (Technical Report 80-3).
Urbana, Ill.: University of Illinois, Computer-based Education
Research Lab, 1980.
Van Der Flier, H. Environmental factors and deviant response patterns.
In Poortinga (Ed.), Basic problems in cross-cultural psychology.
Amsterdam: 1977.

Wood, R.L., Wingersky, M.S., & Lord, F.M. LOGIST: A computer program
for estimating examinee ability and item characteristic curve parameters
(Research Memorandum 76-6). Princeton, N.J.: Princeton University,
Educational Testing Service, 1976 (revised 1978).

Yamamoto, Y., & Wise, S. Extracting unidimensional chains from multi-
dimensional datasets: A graph theory approach (Technical Report 80-2).
Urbana, Ill.: University of Illinois, Computer-based Education Research
Lab, 1980.






